938 research outputs found

    Text Document Classification: An Approach Based on Indexing

    In this paper we propose a new method of classifying text documents. Unlike conventional vector space models, the proposed method preserves the sequence of term occurrence in a document. The term sequence is preserved with the help of a novel data structure called the ‘Status Matrix’, and a corresponding technique is proposed for efficient classification of text documents. In addition, to avoid sequential matching during classification, we propose to index the terms in a B-tree, an efficient indexing scheme. Each term in the B-tree is associated with a list of class labels of the documents that contain the term. To corroborate the efficacy of the proposed representation and status-matrix-based classification, we have conducted extensive experiments on various datasets. Original source URL: http://aircconline.com/ijdkp/V2N1/2112ijdkp04.pdf For more details: http://airccse.org/journal/ijdkp/vol2.htm
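
    The term index described in this abstract can be illustrated with a minimal sketch. It makes several assumptions: a sorted list searched with `bisect` stands in for the B-tree (Python's standard library has no B-tree), and the names `TermIndex` and `classify` are hypothetical. The simple label vote is not the paper's status-matrix classification, which also uses term order. It only shows how a term-to-class-labels index avoids scanning every training document.

```python
import bisect
from collections import Counter


class TermIndex:
    """Sorted term index; an assumed stand-in for the B-tree in the abstract."""

    def __init__(self):
        self._terms = []   # terms kept in sorted order
        self._labels = []  # _labels[i] holds the class labels for _terms[i]

    def add_document(self, terms, label):
        """Associate every term of a training document with its class label."""
        for term in terms:
            i = bisect.bisect_left(self._terms, term)
            if i == len(self._terms) or self._terms[i] != term:
                self._terms.insert(i, term)
                self._labels.insert(i, set())
            self._labels[i].add(label)

    def lookup(self, term):
        """Return the class labels of documents containing the term."""
        i = bisect.bisect_left(self._terms, term)
        if i < len(self._terms) and self._terms[i] == term:
            return self._labels[i]
        return set()


def classify(index, terms):
    """Simplified vote: each query term votes for every label it is indexed under."""
    votes = Counter()
    for term in terms:
        votes.update(index.lookup(term))
    return votes.most_common(1)[0][0] if votes else None


index = TermIndex()
index.add_document(["goal", "match", "team"], "sport")
index.add_document(["election", "vote", "party"], "politics")
predicted = classify(index, ["match", "goal", "vote"])
```

    Each lookup is a binary search over the sorted terms, so classifying a query costs one search per query term rather than a pass over the whole training set.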

    A New Feature Selection Method based on Intuitionistic Fuzzy Entropy to Categorize Text Documents

    Selecting highly discriminative features from text documents is a major challenge in categorization. Feature selection is an important task that involves dimensionality reduction of the feature matrix, which in turn enhances the performance of categorization. This article presents a new feature selection method based on Intuitionistic Fuzzy Entropy (IFE) for text categorization. First, the Intuitionistic Fuzzy C-Means (IFCM) clustering method is employed to compute the intuitionistic membership values. The computed membership values are used to estimate intuitionistic fuzzy entropy via match degree. Features with lower entropy values are then selected to categorize the text documents. To assess the efficacy of the proposed method, experiments are conducted on three standard benchmark datasets using three classifiers, with F-measure used to assess classifier performance. The proposed method shows impressive results compared to other well-known feature selection methods. Moreover, the Intuitionistic Fuzzy Set (IFS) property addresses the uncertainty limitations of the traditional fuzzy set.

    Automatic Irony Detection using Feature Fusion and Ensemble Classifier

    With the advent of micro-blogging sites, users are pioneers in expressing their sentiments and emotions on global issues through text. Automatic detection and classification of sentiments such as sarcastic or ironic content in micro-blogging reviews is a challenging task. It requires a system that manages some kind of knowledge to interpret the sentiment expressed in text. The available approaches are quite limited in their capability and scope to detect ironic utterances present in the text. In this regard, the paper proposes feature fusion to provide knowledge to the system through alternative sets of features obtained using linguistic and content-based text features. The proposed work extracts five sets of linguistic features and fuses them with features selected using a two-stage feature selection method. To demonstrate the effectiveness of the proposed method, we conduct extensive experiments with different feature subsets. The performance of the proposed method is evaluated using Support Vector Machine (SVM), Logistic Regression (LR), Random Forest (RF), Decision Tree (DT) and ensemble classifiers. The experimental results show that the proposed approach significantly outperforms the conventional methods.

    Anomaly based Intrusion Detection using Modified Fuzzy Clustering

    This paper presents a network anomaly detection method based on fuzzy clustering. Computer security has become an increasingly vital field in computer science in response to the proliferation of private sensitive information. As a result, the Intrusion Detection System has become an indispensable component of computer security. The proposed method consists of three steps: pre-processing, feature selection and clustering. In the pre-processing step, duplicate samples are eliminated from the sample set. Next, principal component analysis is adopted to select the most discriminative features. In the clustering step, the network samples are clustered using the Robust Spatial Kernel Fuzzy C-Means (RSKFCM) algorithm. RSKFCM is a variant of traditional Fuzzy C-Means that considers the neighbourhood membership information and uses a kernel distance metric. To evaluate the proposed method, we conducted experiments on a standard dataset and compared the results with state-of-the-art methods. We used cluster validity indices, accuracy and false positive rate as performance metrics. Experimental results indicate that the proposed method achieves better results than the other methods.
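
    The pre-processing and clustering steps named in this abstract can be sketched as follows, under loudly stated assumptions. The code uses traditional Fuzzy C-Means with Euclidean distance, not RSKFCM: it has no spatial neighbourhood term and no kernel metric. It also omits the PCA step. The function names, the deterministic choice of initial centres and the toy data are all hypothetical.

```python
def deduplicate(samples):
    """Pre-processing: drop exact duplicate samples, keeping first occurrences."""
    return list(dict.fromkeys(tuple(s) for s in samples))


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def fuzzy_c_means(points, c=2, m=2.0, iters=30):
    """Traditional FCM (assumption: not the RSKFCM variant from the abstract)."""
    n, dim = len(points), len(points[0])
    # Simplistic deterministic initialisation: evenly spaced samples as centres.
    centers = [points[j * n // c] for j in range(c)]
    u = []
    for _ in range(iters):
        # Membership update: u_ij is proportional to d_ij^(-2/(m-1)).
        u = []
        for p in points:
            d2 = [max(sq_dist(p, ctr), 1e-12) for ctr in centers]
            inv = [d ** (-1.0 / (m - 1.0)) for d in d2]
            total = sum(inv)
            u.append([v / total for v in inv])
        # Centre update: mean of the points weighted by u_ij^m.
        centers = []
        for j in range(c):
            w = [row[j] ** m for row in u]
            wsum = sum(w)
            centers.append(tuple(
                sum(wi * p[d] for wi, p in zip(w, points)) / wsum
                for d in range(dim)
            ))
    return centers, u


samples = [(0, 0), (0, 1), (1, 0), (0, 0),
           (10, 10), (10, 11), (11, 10), (10, 10)]
points = deduplicate(samples)
centers, memberships = fuzzy_c_means(points, c=2)
```

    Each row of `memberships` sums to 1, so a sample can belong partly to several clusters. That soft assignment is the property that kernel and spatial variants such as RSKFCM build on.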

    Automated ECG Analysis for Localizing Thrombus in Culprit Artery Using Rule Based Information Fuzzy Network

    Cardiovascular diseases are among the foremost causes of mortality in today’s world. The prognosis for cardiovascular diseases is usually based on the ECG signal, a simple 12-lead electrocardiogram (ECG) that gives complete information about the function of the heart, including the amplitude and time interval of the P-QRST-U segment. This article proposes a novel approach to identify the location of a thrombus in the culprit artery using the Information Fuzzy Network (IFN). The Information Fuzzy Network, being a supervised machine learning technique, takes known evidence based on rules to create a classification model that predicts thrombus location from the vast input ECG data. These rules are well-defined procedures for selecting the hypothesis that best fits a set of observations. Results show that the proposed approach yields an accuracy of 92.30%. This novel approach is shown to be a viable ECG analysis approach for identifying the culprit artery and thus localizing the thrombus.

    A Convolution Neural Network Engine for Sclera Recognition

    The world is shifting to the digital era at an enormous pace. This rise in digital technology has created plenty of applications in the digital space, which demand a secure environment for transacting and for authenticating the genuineness of end users. Biometric systems and their applications have shown great potential for use in the tech industry. Among various biometric traits, the sclera trait is attracting researchers to experiment with and explore its characteristics for recognition systems. This paper, which is the first of its kind, explores the power of the Convolution Neural Network (CNN) for sclera recognition by developing a neural model that trains its neural engine for a recognition system. To do so, the proposed work uses the standard benchmark dataset from the Sclera Segmentation and Recognition Benchmarking Competition (SSRBC 2015), which comprises 734 images captured at different viewing angles from 30 different classes. The results of the proposed methodology showcase the potential of neural learning for sclera recognition systems.

    Multilayer Feedforward Neural Network for Internet Traffic Classification

    Recently, efficient internet traffic classification has gained attention as a way to improve service quality in IP networks. However, existing solutions struggle to handle imbalanced datasets, which have a highly uneven distribution of flows between the classes. In this paper, we propose a multilayer feedforward neural network architecture to handle the highly imbalanced dataset. In the proposed model, we use a variation of the multilayer perceptron with 4 hidden layers (called mountain mirror networks), which performs the feature transformation effectively. To check the efficacy of the proposed model, we use the Cambridge dataset, which consists of 248 features spread across 10 classes. Experiments are carried out on two variants of the same dataset: the standard one and a derived subset. The proposed model achieved an accuracy of 99.08% on the highly imbalanced (standard) dataset.

    Sentiment Analysis on IMDb Movie Reviews Using Hybrid Feature Extraction Method

    Social networking sites have become popular and common places for sharing a wide range of emotions through short texts. These emotions include happiness, sadness, anxiety, fear, etc. Analyzing short texts helps in identifying the sentiment expressed by the crowd. Sentiment analysis on IMDb movie reviews identifies the overall sentiment or opinion expressed by a reviewer towards a movie. Many researchers are working on pruning the sentiment analysis model so that it clearly identifies and distinguishes between a positive review and a negative review. In the proposed work, we show that the use of hybrid features obtained by concatenating machine learning features (TF, TF-IDF) with lexicon features (positive-negative word count, connotation) gives better results in terms of both accuracy and complexity when tested against classifiers such as SVM, Naïve Bayes, KNN and Maximum Entropy. The proposed model clearly differentiates between a positive review and a negative review. Since understanding the context of the reviews plays an important role in classification, using hybrid features helps in capturing the context of the movie reviews and hence increases the accuracy of classification.
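
    The feature concatenation described in this abstract can be sketched roughly as below. This is an illustration under assumptions, not the authors' pipeline: it uses plain term frequency only (no TF-IDF), a tiny made-up lexicon, whitespace tokenisation, and it omits the connotation feature. The names `POSITIVE`, `NEGATIVE`, `build_vocabulary` and `hybrid_features` are hypothetical.

```python
from collections import Counter

# Hypothetical toy lexicon; a real system would load a full sentiment lexicon.
POSITIVE = {"good", "great", "excellent"}
NEGATIVE = {"bad", "boring", "awful"}


def build_vocabulary(reviews):
    """Sorted vocabulary of all lower-cased, whitespace-separated tokens."""
    return sorted({word for review in reviews for word in review.lower().split()})


def hybrid_features(review, vocabulary):
    """Concatenate a term-frequency vector with positive/negative word counts."""
    tokens = review.lower().split()
    counts = Counter(tokens)
    n = len(tokens) or 1  # avoid division by zero for an empty review
    tf = [counts[term] / n for term in vocabulary]
    lexicon = [
        sum(token in POSITIVE for token in tokens),
        sum(token in NEGATIVE for token in tokens),
    ]
    return tf + lexicon


reviews = ["great great movie", "boring plot"]
vocabulary = build_vocabulary(reviews)
features = [hybrid_features(review, vocabulary) for review in reviews]
```

    The resulting vectors have one entry per vocabulary term plus two lexicon counts, and can be passed to any of the classifiers the abstract lists.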


    Successful cash management is essential to both corporate and personal financial success. This abstract reviews the main ideas, tactics, and advantages of cash management. To maintain economic security, liquidity, and development, effective cash management, which includes everything from transaction volume to long-term investments, is essential. This abstract examines a number of aspects of cash management, such as anticipating cash flows, managing liquidity, and streamlining the process with technology and financial instruments. It goes into detail about how important it is to keep a sufficient cash balance to cover current expenses while optimising the return on surplus funds. In addition to helping individuals and organizations take advantage of financial opportunities like investments, purchases, and debt

    Role of Nitrogen Source for L-Glutaminase Production from Fungal Strain using through Submerged Fermentation

    L-glutaminase has attracted much attention due to its wide range of applications in several fields. It is widely used in the pharmaceutical and food industries, and is generally regarded as a key enzyme that controls the delicious taste of fermented foods such as soy sauce. L-glutaminase production was carried out with supplementation of nitrogen sources: yeast extract, malt extract, peptone and urea at concentrations ranging from 0.25% to 1.25% in increments of 0.25%, and the inorganic nitrogen sources ammonium sulphate and ammonium chloride at concentrations ranging from 0.025% to 0.125% in increments of 0.025%. Under submerged fermentation, malt extract (1%) was the best organic nitrogen source, producing 399.9 IU, and ammonium sulphate (0.1%) appeared to be a good inorganic nitrogen source, yielding 546 IU. The current study is an exploratory step towards upscaling L-glutaminase production in the industrial sector, and it offers the commercial sector a useful strategy and an alternative to older methods.